Abstract

We use detailed data from New York City to estimate how the characteristics of school principals relate 
to school performance, as measured by students’ standardized exam scores and other outcomes. We 
find little evidence of any relationship between school performance and principal education and pre- 
principal work experience, although we do find some evidence that experience as an assistant principal 
at the principal’s current school is associated with higher performance among inexperienced principals. 
However, we find a positive relationship between principal experience and school performance, 
particularly for math test scores and student absences. The experience profile is especially steep over 
the first few years of principal experience. Finally, we find mixed evidence on the relationship between 
formal principal training and professional development programs and school performance, with the 
caveat that the selection and assignment of New York City principals participating in these programs 
make it hard to isolate their effects. The positive returns to principal experience suggest that policies 
which cause principals to leave their posts early (e.g., via early retirement or a move into district 
administration) will be costly, and the tendency for less-advantaged schools to be run by less 
experienced principals could exacerbate educational inequality. 
School Principals and School Performance



Introduction 

State and local school accountability systems have become widespread, in part due to requirements of 
the federal No Child Left Behind legislation. The focus on schools, as opposed to school districts or 
teachers, presupposes that school-level policy decisions matter. These decisions are, in large part, 
determined by school principals, who have an important influence on the composition of the school 
workforce and course content, and who are responsible for monitoring the quality of instruction 
delivered by teachers. However, in contrast to the large literature on teacher quality (Rivkin, Hanushek, 
and Kain 2005; Rockoff 2004; Harris and Sass 2006; Kane, Rockoff, and Staiger 2008; Buddin and 
Zamarro 2009), few studies have addressed whether principals impact school performance and, if they 
do, which principal characteristics determine principal effectiveness. 

The literature on principals is sparse in part because of the difficulties faced in defining and 
measuring principal effectiveness and in part because of the paucity of high-quality data on which 
convincing empirical strategies can be based. In this paper we present new evidence on the relationship 
between principal characteristics and school performance using data from New York City Department 
of Education (hereafter NYC). There are a number of reasons why NYC is an especially attractive 
setting to study these relationships. First, it is the largest school district in the nation and employs well 
over 1,000 principals. Second, nearly all of its principals are hired from within the school system, and 
we have detailed information on their entire career as educators in NYC. Third, for elementary and 
middle schools, on which we focus, we have data on student outcomes covering eight school years, 
which allows us to examine how school performance varies over a principal’s career and how it 
changes when schools change principals. Fourth, since our data are at the student level, we can estimate
models of school performance that control for student characteristics in a very flexible manner. A fifth 
reason to be interested in NYC is that, in 1991, the city implemented an unusual policy that generated 
quasi-experimental variation in principal characteristics across schools. We describe this program in 
detail here, though our analysis of this experiment is ongoing.

We use a variety of empirical strategies to estimate the relationship between principal 
characteristics and school performance. These strategies are non-experimental, and identify the causal 
effects of principal characteristics on school performance only under certain identifying assumptions 
which we make clear below. These non-experimental estimates are less informative than would be 
experimental estimates of the same parameters. Nevertheless, we see two reasons why they mark an 
important step forward for the literature on school principals. First, principal candidates are not 
randomly assigned to schools and it is difficult to conceive of natural experiments that will generate the 
quasi-random variation in principal characteristics required to identify causal effects without strong 
identifying assumptions. Second, our understanding of the relationship between principal characteristics 
and school performance is extremely limited. In our review of the related literature, we found few 
studies of whether principals influence school performance, and few convincing studies of the impact of 
specific principal characteristics such as education and experience.1 Non-experimental estimates of
these relationships can therefore provide valuable new information and inform policies relating to 
principal hiring and compensation. 

The main strategy that we use to estimate the relationship between principal characteristics and 
school performance exploits principal turnover within the same school (i.e., controls for school fixed 
effects). By comparing different principals working at the same school, we can hold constant all 
persistent differences across schools. In addition, to control for the possibility that different types of 
principals attract or are attracted to different types of students, we also include controls for
pre-determined student characteristics (e.g., poverty and race), both at the individual and school level.
For middle school students, these controls include the test scores they most recently obtained in
elementary school.

1 Studies of principals in Texas (Branch, Hanushek, and Rivkin 2008) and in British Columbia (Coelli and Green 2009) are
two exceptions which we describe below.

Our analysis leads to three main findings. First, we find little evidence of any relationship
between school performance and the selectivity of a principal’s undergraduate or graduate institution.
We also find little relationship between performance and a principal’s prior work experience, with the 
exception that, among very inexperienced principals, school performance is higher among those that 
were previously assistant principals at their current school. Second, we find a positive relationship 
between principal experience and school performance, particularly for math test scores and student 
absences. The experience profile is especially steep over the first few years of principal experience. 
Third, we find mixed evidence on the relationship between principal training and professional 
development programs and school performance, with the caveat that the selection and assignment of 
principals participating in the NYC training programs make it hard to isolate their effects. 

We draw three conclusions from these findings. First, in regard to principal selection, our 
results suggest that characteristics that can be directly observed on a resume - such as the selectivity 
of the school from which a candidate received their master’s degree - are probably less important than 
characteristics that cannot, such as leadership skills and motivation. Second, in regard to principal 
retention, the positive returns to principal experience suggest that policies which cause principals to 
leave their posts early (e.g., via early retirement or a move into district administration) could lower 
school performance. Third, our results suggest that high rates of turnover in less-advantaged schools 
could exacerbate educational inequality. Principal training could improve the performance of new 
principals and further enhance the performance of more experienced principals, but determining the 
effects of training is complicated by non-random selection of individuals into these programs. 

Policy and Evidence 

Principal Promotion and Retention Policies 

Traditionally, educators seeking promotion to principal positions were required to serve time as 
teachers and assistant principals and accumulate the academic credits necessary to obtain the relevant 
certification. As discussed by Ballou and Podgursky (1993), this model occupies a middle ground 
between a system with even greater “professionalization” - in which, among other things, individuals
trying to become principals might be required to obtain doctorates in education - and a system with 
much lower entry barriers, in which potential principals could be promoted from non-traditional 
backgrounds, without formal education credentials but with management training and, perhaps, private 
sector experience. 

In recent years, school systems such as New York City have changed hiring procedures in ways 
that change the pool of individuals who become principals. These changes are based on the notion that 
principals need not have served the district for a long period of time in order to be effective leaders, and 
that talented educators should be promoted when they are considered ready to lead schools. In NYC, 
this has led to a dramatic change in the age profile of principals. For example, more than 130 of the
roughly 1,000 elementary and middle school principals in NYC are under the age of 35, up from fewer
than 30 in 2002 (authors’ calculation based on data described below). A significant fraction of the
younger principals in NYC also have bachelor’s and master’s degrees from Ivy League universities such as
Columbia and Harvard, and we examine whether the selectivity of principals’ undergraduate and
graduate institutions is predictive of school performance.

New York City policies have also supported the idea that educators with leadership talent can be 
transformed into effective principals via intensive principal training programs. The most prominent of 
these is the Aspiring Principals Program (APP), a 14-month intensive training program designed to 
prepare educators for principal positions. Others in operation over our sample period include programs 
such as “New Leaders for New Schools”, “Tomorrow’s Principals” and the “Bank Street Academy”. In 
the school year 2006-2007, more than one half of newly recruited NYC principals had passed through 
one of these training programs.2

2 All individuals who became principals during our sample period must have completed state certification requirements via
completion of a state-approved program of study. Thus, principals trained in a program such as the APP will be compared
primarily to principals who fulfilled certification requirements on their own initiative as well as the small number of
principals who received certification via some other formal training program.

While policy relating to principals’ subsequent careers has provoked less comment, there is an
ongoing debate surrounding the optimal length of a principal’s career, clearly relevant to policies
relating to principal salary structure and retirement incentives. Joel Klein, New York City’s Schools 
Chancellor, is on record as saying he would like good principals to stay in schools for eight to ten years 
(Gootman and Gebeloff 2009), significantly longer than the current median principal experience in 
NYC. The assumption that more-experienced principals are more effective also raises concerns that — as 
has been shown for teachers, e.g., Hanushek, Kain, and Rivkin (2004) — principal turnover may be 
higher at schools serving more disadvantaged students, and that, when experienced principals move 
within a district, they move to schools serving more advantaged students. We provide empirical 
confirmation that these patterns exist in NYC. 

Evidence on Principal Characteristics and School Performance 

There are several strands of literature concerned with the relationships between principal characteristics 
and school performance. One set of studies relates raw test scores to principal characteristics (Blank 
1987; Eberts and Stone 1988; Heck, Larsen, and Marcoulides 1990; Brewer 1993; Heck and 
Marcoulides 1993; Hallinger and Heck 1996; Waters, Marzano, and McNulty 2003; Witziers, Bosker, 
and Kruger 2003). These studies are limited in that they consider only a narrow range of principal 
characteristics, use only a small sample of schools, and do not adequately control for factors that 
confound the relationship between achievement and principal characteristics such as student 
demographics. A second strand of this literature examines other student-based outcomes such as 
attendance (Blank 1987) and student engagement (Leithwood and Jantzi 1999). These outcomes are 
also likely to be strongly influenced by student composition, making it difficult to draw conclusions 
from these studies without strong identification assumptions. Moreover, it is also unclear whether these 
measures represent inputs or outputs. 

A third strand of the literature measures school performance using teacher-based outcomes such 
as teachers’ evaluations of school principal performance (Ballou and Podgursky 1993) and teacher 
mobility/attrition (Gates et al. 2006). Teacher ratings can overcome some of the limitations of
student-based measures, to the extent that teachers are aware of factors such as the socio-economic background
of the student body, but these evaluations are subjective, and may not be strongly related to student
outcomes. 

These three sets of papers generate mixed evidence with regard to the relationship between school
performance and principal characteristics. In terms of principal education, both Eberts and Stone (1988) 
and Ballou and Podgursky (1993) find a negative correlation between school performance and principal 
education, as measured by advanced degrees and graduate training. One explanation, advanced by 
Eberts and Stone (1988), is that highly-educated principals are placed in low-performing schools. Both 
Eberts and Stone (1988) and Ballou and Podgursky (1993) find a positive association between years of 
teaching experience and school performance although Brewer (1993) finds no such correlation. 
Evidence on the impacts of principal experience is also mixed: Eberts and Stone (1988) find positive 
effects while Ballou and Podgursky (1993) find no correlation. These mixed results could reflect 
differences across outcomes, controls, or sample characteristics. None of these studies has addressed
whether principals who participated in training programs improve school performance, in part because
these programs are a relatively recent phenomenon. 

This paper constitutes a fourth strand of this literature, one that analyzes these relationships 
using detailed administrative principal data matched to detailed administrative student data. To our 
knowledge, only Branch, Hanushek, and Rivkin (2008) have conducted a similar analysis. Focusing on 
Texas, they document changes in the composition of principals and patterns of principal mobility. They 
then estimate school performance models that include student characteristics and school characteristics, 
principal and school fixed effects and principal experience and tenure. Their interest is in the 
relationship between principal mobility and principal effectiveness (as measured by the estimated 
principal fixed effects) and in the relationship between principal experience and school performance. 
We return to their findings in Section 6, and compare their results with our own. 



Three recent papers have addressed more specific questions relating to principals. Coelli and 
Green (2009) estimate the variation in principal effectiveness across schools. Their approach — a version 
of the method Rivkin, Hanushek, and Kain (2005) use to analyze teacher impacts — ignores specific 
principal characteristics such as education and experience and focuses instead on the correlation 
between within-school performance variation and within-school principal turnover: under some 
assumptions, the strength of this correlation is increasing in the variation in principal effectiveness. One 
important assumption is that student sorting is unrelated to principal effectiveness. If that assumption is 
violated (e.g., if effective principals attract high-achieving students to a school), this method will
overstate the variation in principal effectiveness. Their results suggest some role for principal quality in
determining students’ standardized exam scores, but they find little impact on graduation rates. Cullen 
and Mazzeo (2008) examine the relationship between school performance and the principal’s future 
wages. They find that strong performance is associated with increased future wages, which suggests 
that the principal labor market may provide effective incentives for principals. 

Finally, Corcoran, Schwartz, and Weinstein (2009) evaluate the NYC Aspiring Principals 
Program discussed above. They find evidence that APP principals enter schools whose performance is 
on the decline, and that they improve school performance on standardized tests, particularly in English, 
after about three years. We discuss this study in more detail below after presenting our findings relating 
to the APP program. 

Identifying the Effects of Principal Characteristics 

In this section we describe the conceptual issues surrounding the empirical identification of the effects
of principal characteristics. We begin with a simple model where school performance is a function of
principal characteristics and a random disturbance term (i.e., we postpone any role for teachers and
students). Specifically, assume the performance of school i at time t ($Y_{it}$) can be written as a function of
a time-invariant observed principal characteristic such as post-graduate education ($PC_i$), a time-varying
observed principal characteristic such as principal experience ($PE_{it}$), a time-invariant unobserved
principal characteristic which we call ability ($PA_i$), and a random disturbance term ($e_{it}$):

(1) $Y_{it} = b_0 + b_1 PC_i + b_2 PE_{it} + PA_i + e_{it}$

All of the claims made about identification of this three-variable model extend to the case in which 
we have many observable principal characteristics (experience as an Assistant Principal, principal 
training and so on) and many unobservable dimensions (work ethic, leadership skills and so on). 

Since principal ability is unobserved, a regression of the outcome on principal education and 
experience would identify the causal effect of principal education plus an ability bias generated by any 
correlation between education and unobserved ability (conditional on experience) and the causal effect 
of principal experience plus an ability bias generated by the correlation between experience and 
unobserved ability (conditional on education). All of the strategies discussed below will generate 
estimates of principal characteristics that are potentially confounded by these types of ability biases. 
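
To make the ability bias concrete, consider a deliberately simplified version of Equation 1 in which experience is dropped and the disturbance is assumed to be uncorrelated with the observed characteristic. These simplifications are assumptions made here for illustration only. Under them, the probability limit of the OLS coefficient from a regression of performance on the observed characteristic is the causal effect plus a term driven by the covariance between ability and that characteristic:

```latex
\[
\operatorname{plim}\, \hat{b}_1
  = \frac{\operatorname{Cov}(Y_{it}, PC_i)}{\operatorname{Var}(PC_i)}
  = b_1 + \frac{\operatorname{Cov}(PA_i, PC_i)}{\operatorname{Var}(PC_i)},
\qquad \text{assuming } \operatorname{Cov}(e_{it}, PC_i) = 0 .
\]
```

The second term is zero only if unobserved ability is uncorrelated with the observed characteristic. With several observed characteristics the same logic applies to each coefficient, conditional on the others.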

A more realistic model of school performance would allow for the influence of factors such
as average student background characteristics ($X_{it}$), average teacher quality ($T_{it}$) and the quality of
school facilities ($F_{it}$):

(2) $Y_{it} = b_0 + b_1 PC_i + b_2 PE_{it} + b_3 X_{it} + b_4 T_{it} + b_5 F_{it} + PA_i + e_{it}$

Since this equation refers to school-level achievement, we can interpret the influence of the school peer 
group as comprising the influence of students’ own background characteristics plus any peer effects in 
operation. In this framework there are two additional sources of potential bias that arise from the sorting 
of students and principals to schools. Student sorting biases would arise if, for instance, parents that 
place a high value on education seek out schools with more-educated and more-experienced principals. 
School sorting biases would arise if, for example, more-educated and more-experienced principals 
move to schools with better facilities. 
To deal with these types of sorting biases we estimate models that include school fixed effects.
These will capture the effects of any school characteristics that are time-invariant over our sample 
period. In models that include school fixed effects, estimates of the relationship between principal 
characteristics and school performance are identified via a comparison of different principals working 
in the same school. Since student characteristics may vary over the sample period, we also control for 
an extensive set of student characteristics. For middle school students, these characteristics include
scores on tests taken while in elementary school.
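
The logic of the school fixed effects comparison can be sketched with a small simulation. This is an illustrative sketch, not the estimation code used in the paper. All names (`simulate_schools`, `pooled_and_within`) and all parameter values (400 schools, 8 years, a true experience effect of 0.05, a 20 percent annual turnover rate) are hypothetical choices made for the example. The simulated data build in sorting of experienced principals to schools with high persistent quality, so a pooled regression mixes the experience effect with school quality, while the within-school comparison removes the persistent school component.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor OLS of y on x (with an intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Within transformation: subtract each group's mean from its members."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def simulate_schools(n_schools=400, n_years=8, true_effect=0.05, seed=0):
    """Simulated school-year panel (hypothetical parameter values throughout)."""
    rng = random.Random(seed)
    school, experience, score = [], [], []
    for s in range(n_schools):
        school_effect = rng.gauss(0.0, 1.0)  # persistent, unobserved school quality
        # Sorting: higher-quality schools start with more experienced principals.
        years = max(0, round(3 + 2 * school_effect + rng.gauss(0.0, 1.0)))
        for t in range(n_years):
            if t > 0 and rng.random() < 0.2:
                years = 0  # turnover: a new principal with no experience arrives
            school.append(s)
            experience.append(years)
            score.append(true_effect * years + school_effect + rng.gauss(0.0, 0.5))
            years += 1
    return school, experience, score


def pooled_and_within(school, experience, score):
    """Return (pooled OLS slope, school-fixed-effects slope) for experience."""
    pooled = ols_slope(experience, score)
    within = ols_slope(demean_by_group(experience, school),
                       demean_by_group(score, school))
    return pooled, within


if __name__ == "__main__":
    pooled, within = pooled_and_within(*simulate_schools())
    print(f"pooled: {pooled:.3f}  within-school: {within:.3f}  (true effect: 0.05)")
```

In this sketch principal ability is absent by construction, so the within-school slope isolates the experience effect. In real data, as the discussion that follows notes, the within-school estimate would still include any ability bias.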

Even after controlling for school fixed effects and student characteristics, our estimates will still
be subject to ability biases. For example, among the set of principals that have worked at a particular
school, those with most experience could also be the most able. Such a relationship would arise if 
principal retention decisions are based partly on principal ability. However, whether we want to correct 
for these ability biases depends on the specific research question under consideration. For both
time-varying principal characteristics (such as experience) and time-invariant characteristics (such as the
quality of one’s post-graduate degrees), there are two questions about the effect of a characteristic X we
might like to answer:

a) What is the effect of replacing a principal who has X = x_k (e.g., x_k years of experience) with a
principal who has X = x_l?

b) What is the effect of X (e.g., years of experience) on the performance of a principal?

3 While the addition of school fixed effects is very helpful in mitigating this bias, identification of the relationship between
school performance and time-invariant principal characteristics (e.g., prior experience as an assistant principal) will be based
only on performance of schools which changed principals at least once during our sample. For example, these estimates are
unlikely to be based on schools that opened near the end of our sample period since these schools will typically only have
had one principal.

With regard to principal selection policies, the first question is arguably more interesting. To
see this, consider a school district administrator making a hiring decision. Assuming their goal
is to hire the most effective principal, they will be interested in knowing whether one candidate will
perform better in a given school than another candidate. Consequently, they will want to know if a given
characteristic, like whether they graduated from an Ivy League school, is useful for predicting how
effective they will be as a principal. For this question, the ability “bias” is part of the effect of interest. 

In contrast, with regard to the cost-effectiveness of policies that directly affect the acquisition of 
principal characteristics, the second question is more appropriate. For example, if experience has no 
causal impact on performance, policies that lengthen the careers of all principals (both high and low 
ability) will be ineffective. 

In this paper we focus on the first of these questions, in large part because the second is 
extremely difficult to answer in a non-experimental setting. For example, the effects of time-invariant 
characteristics (such as education) on principal effectiveness can only be identified by variation across 
principals, and without (quasi) random assignment, they are likely correlated with unobserved 
principal characteristics. The effects of experience (or, more generally, time-varying characteristics) 
can be identified via “within-principal” comparisons, but these can be problematic. For example, if 
there is selective attrition, such that not all of the principals with three years of experience stay in the 
system for a fourth year, the estimated return to the fourth year of experience may be “local” to this 
subgroup of stayers. This may be an interesting parameter for some purposes, but may not be a good 
guide to the return to an additional year of experience for a randomly chosen principal with three 
years of experience. 

Another important issue is how the inclusion or exclusion of other school-level inputs into 
Equation 2 will affect the interpretation of the coefficient estimates on principal characteristics. For 
example, if teacher characteristics were completely under a principal’s control, then a specification that 
included teacher characteristics might understate the relationship between principal characteristics and 
school performance. Intuitively, controlling for these characteristics would shut down some of the 
channels through which principals affect school performance. On the other hand, we probably would
want to control for student background characteristics, since principals cannot control
important factors like students’ socio-economic status. If these are uncorrelated with principal
characteristics then controlling for them will generate more precise estimates; if these are correlated
with principal characteristics, then controlling for them will help address biases from sorting. 

Finally, we follow the earlier literature on identifying the impact of teaching experience from 
variation within teachers (Rockoff 2004), and assume that the benefit to additional experience is 
negligible after the first several years of a principal’s career. By treating principals with experience
above some level $E^*$ as having the same amount of experience, we eliminate the problem of
collinearity. In addition, the plausibility of this identifying assumption can be checked by testing
whether the estimated marginal benefit of experience approaches zero as $E_{it}$ approaches $E^*$.
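
One way such a cap could be coded is sketched below. The function names and the choice of zero years as the omitted category are assumptions made for this illustration, and the cutoff is passed in as a parameter because its value is not fixed in this section. The helper `marginal_gains` takes a list of estimated coefficients on the experience indicators (measured relative to the omitted category) and returns the gain from each additional year, which is the quantity that should shrink toward zero near the cap if the assumption is reasonable.

```python
def cap_experience(years, e_star):
    """Treat all experience at or above e_star as equal to e_star."""
    return min(years, e_star)


def experience_dummies(years, e_star):
    """Indicators for capped experience levels 1..e_star.

    Zero years of experience is the omitted category (an assumption of
    this sketch), so a first-year principal maps to all zeros.
    """
    capped = cap_experience(years, e_star)
    return [1 if capped == level else 0 for level in range(1, e_star + 1)]


def marginal_gains(coefficients):
    """Year-over-year gains implied by coefficients on levels 1..e_star.

    coefficients[k] is the estimated effect of level k+1 relative to the
    omitted category; the first gain is therefore coefficients[0] itself.
    """
    gains = [coefficients[0]]
    for previous, current in zip(coefficients, coefficients[1:]):
        gains.append(current - previous)
    return gains
```

With a cutoff of five, for example, a principal with twelve years of experience receives the same indicator vector as one with exactly five years.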

To summarize, our main strategy for identifying the relationship between school performance 
and principal characteristics will be to regress student performance on principal characteristics, student 
background characteristics, school characteristics and school fixed effects. This strategy will identify 
the causal effects of these characteristics plus the relevant ability biases provided that the set of student 
and school controls (and fixed effects) deal with school and student sorting. As noted above, it is not 
plausible to assume that the principal assignment process is random, and these non-experimental 
estimates may not control for all biases due to school and student sorting. It is nevertheless plausible to 
suppose that these estimates can shed important new light on the relationship between school 
performance and principal characteristics, particularly given the paucity of prior evidence. In ongoing 
work, we exploit a natural experiment to assess the extent to which these biases affect our estimates of
the impacts of principal experience. 
Data



First, to provide credible evidence on the relationship between principal characteristics and school
performance, detailed information on each principal’s education, training, work experience, and other 
traits must be collected. This can be difficult even if principal recruitment is entirely within-district, 
since some principals may have started their careers decades ago. When a district recruits principals 
externally, it is almost impossible.4

Second, data are required on school performance, ideally measured as student achievement.
Until recently, systematic testing of students in U.S. schools at broad geographic levels was uncommon,
so that data that allow for inter-school comparisons using multiple years of performance are hard to
come by. Third, estimated relationships between performance and principal characteristics must
account for differences in factors outside of principals’ control that affect performance and are
correlated with principals’ characteristics. Finally, since school principals are a school-level treatment,
researchers wishing to identify robust relationships between performance and characteristics require 
data containing relatively large numbers of schools. 

The data employed in this paper meet all of these requirements.5 First, we have information on
the employment histories of all individuals who worked as principals in New York City public schools
since 1982. This allows us to generate accurate measures of experience as a principal, assistant
principal, and teacher. The data also include the school where a principal received her bachelor’s and
master’s degrees. We use this information in conjunction with information on the median SAT scores of
undergraduate students at that institution to measure the selectivity of the schools that a principal
attended.6

4 In theory, one could use surveys to collect this information. However, survey responses are likely to contain less 
accurate information than administrative records and are likely to create the potential for bias due to non-response, 
particularly if one attempts to collect data on principals who may have already retired. 

5 More information about the data sets described here is provided in the Data Appendix. 

6 These data were provided to us by Caroline Hoxby. As required by law, every principal in our dataset has a 
master’s degree; hence we do not examine the relationship between master’s degrees and school performance. 
Additionally, we have information on whether principals participated in special training
programs, and we focus on two programs in our analysis. The NYC Leadership Academy’s Aspiring
Principals Program (APP) is a “leadership development program that uses teamwork, simulated-school
projects and job-embedded learning opportunities to prepare participants to lead instructional
improvement efforts in New York City’s high-need public schools.”7 The first APP graduates began as
principals in the school year 2004-05; by 2007, around 10 percent of all NYC principals were graduates
of this program.

In addition to the APP, a smaller number of principals received training through other programs 
(e.g., New Leaders for New Schools). However, the number of principals trained through these 
programs is too small for us to estimate their relationship to school performance with reasonable
precision.8 When we have information on a principal’s participation in one of these programs, we
control for it in our analysis but do not report these coefficient estimates. 

We also look at principals who participated in the Cahn Fellows program. This is “designed to 
support the growth of exemplary school leaders” by “recognizing outstanding NYC principals and 
providing them with opportunities for professional, intellectual and personal growth.” Thus, the Cahn 
Fellows program is quite different from the APP or other principal training programs; it is a 
professional development program that works with sitting principals with four or more years of 
experience in high performing schools, and its focus is on both recognition and training. The first group 
of Cahn fellows was selected in the 2003-2004 school year. Examining the Cahn Fellows program is 
important because it sheds light on whether professional development programs for highly experienced 
and successful principals can help them further improve student performance. It is also interesting to 

7 See http://www.nycleadershipacademy.org/aspiringprincipals/app overview for this and more information. 

8 Almost two percent of principals were trained via the New Leaders for New Schools program, but about half of them 
served in high schools (for whom we have no test score data), and those in elementary and middle schools were typically the 
only principal at the school during our sample period, leaving only a handful of cases identified under our school fixed 
effects approach. In contrast, about 60 Cahn Fellows and 120 Aspiring Principals Program graduates are included in our 
analysis. 
investigate whether principals chosen to be Cahn Fellows are leading schools that are outperforming
comparable schools. If they are, then it would suggest that the selection process used to pick Cahn
Fellows may be useful in identifying effective principals.

Our data on school performance span the school years 1998-99 through 2006-07. Our primary 
measures are standardized tests in math and English taken by grade 3-8 students in the roughly one 
thousand elementary and middle schools in NYC. In order to avoid any issues related to changes in test 
design, we normalized the exams’ “scaled scores” by year and grade level to have a mean of zero and a 
standard deviation of one. Thus, our estimates below are scaled in terms of student-level standard
deviations of achievement. Additionally, while test scores in New York improved over this time 
period, our estimates are based on variation in relative performance within the school district. 
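
As an illustration only, the normalization described above can be sketched as follows. This is a minimal sketch and not the code used to produce our estimates; the record layout, the function name `standardize_by_cell`, and the use of the population standard deviation within each year-by-grade cell are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_cell(records):
    """Return copies of records with a z-score computed within (year, grade)."""
    cells = defaultdict(list)
    for r in records:
        cells[(r["year"], r["grade"])].append(r["scaled_score"])

    # Mean and population standard deviation for every year-by-grade cell.
    stats = {cell: (mean(v), pstdev(v)) for cell, v in cells.items()}

    out = []
    for r in records:
        cell = (r["year"], r["grade"])
        mu, sd = stats[cell]
        if sd == 0:
            raise ValueError(f"no score variation in cell {cell}")
        out.append({**r, "z": (r["scaled_score"] - mu) / sd})
    return out


# Hypothetical toy data, not drawn from the NYC files.
records = [
    {"year": 1999, "grade": 3, "scaled_score": 600},
    {"year": 1999, "grade": 3, "scaled_score": 620},
    {"year": 1999, "grade": 3, "scaled_score": 640},
    {"year": 2000, "grade": 3, "scaled_score": 650},
    {"year": 2000, "grade": 3, "scaled_score": 670},
]
standardized = standardize_by_cell(records)
```

Because each cell is centered and rescaled separately, a citywide rise in scaled scores from one year to the next does not move the standardized scores; only a student's position relative to others in the same year and grade does.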

In addition to student exam scores, the data include information on other “intermediate” 
outcomes. For students, these include days absent and days suspended, and for teachers they include 
days absent and retention. While student achievement test scores are relatively common measures of 
school output, impacts on these other outcomes are less straightforward to interpret. On the one hand, 
these can be seen as inputs into future school performance: fewer suspensions might reflect less 
disruption in the classroom; fewer teacher absences might reflect higher levels of teacher job 
satisfaction. On the other hand, fewer suspensions may reflect a lax approach to student discipline or a 
change in the likelihood of reporting an incident rather than a change in student behavior. Similarly, 
fewer teacher absences may be associated with lower teacher performance, to the extent that substitute 
teachers perform better than regular teachers who are ill. Nevertheless, these intermediate outcomes
provide complementary evidence to our analysis of student achievement. 

To control for school sorting biases, we identify the relationship between school performance
and principal characteristics using school fixed effects (i.e., comparing principals at the same schools).
As such, it is worth noting that, over this nine-year period, we observe over one thousand observations
on schools experiencing a principal transition. To control for student sorting biases, we include detailed
controls for the background characteristics of tested students, and the background characteristics of all
students in the tested grades within the school during the year.
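
The logic of the school fixed effects comparison can be illustrated with a small sketch. It is not our estimation code: the data are invented, there is a single regressor and no student controls, and the helper names `demean_by_group` and `ols_slope` are hypothetical. The point is only that subtracting school means removes any persistent difference between schools before the slope is computed.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the within transformation)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def ols_slope(x, y):
    """Slope from a least-squares fit of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Invented example: school B has higher scores for reasons unrelated to its
# principal, and it also happens to have more experienced principals.
school = ["A", "A", "A", "B", "B", "B"]
experience = [0, 1, 2, 4, 5, 6]
score = [0.00, 0.02, 0.04, 0.50, 0.52, 0.54]

pooled = ols_slope(experience, score)
within = ols_slope(demean_by_group(experience, school),
                   demean_by_group(score, school))
```

In this invented example the pooled slope is about 0.11 per year of experience, while the within-school slope is about 0.02, because the pooled comparison attributes the persistent gap between the two schools to experience.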

Descriptive Statistics 

In this section we present some descriptive statistics on the principals in our sample. Table 1 shows 
evidence on the trends in the background and experience of NYC principals. As seen in the left-hand 
columns, over the twenty-year period from 1987 to 2007, the most dramatic trends were in principal 
demographics. The fraction of female principals rose from 33 to 68 percent, and the fraction of non- 
white principals rose from 28 to 49 percent. Similar changes in principals’ gender occurred over this 
period at the national level, but the changes in minority representation nationally were much smaller. 9 
By comparison, changes in average principal experience and the fraction of principals that worked as an 
assistant principal in the same school were quite small. 

The right-hand columns of Table 1 report trends in the composition of new principals in New 
York. The first of these columns reports the number of new principals hired. This increased between 
1987 and 2005, as the number of schools increased, and was especially high in 1992. This 1992 figure 
reflects the many new principals hired to replace those who took
advantage of the 1991 retirement incentive scheme. We discuss this scheme in more detail in Section 7.
The remaining statistics on new principals can be seen as leading indicators of the trends discussed 
above. 

Table 2 describes the increase in the fraction of principals that completed one of the formal 
training or professional development programs observed in our data. At the beginning of the period for 
which we have school performance data, almost no principals had completed such a program, but a 
significant fraction of principals had done so by the school year 2006-07. Over one in ten principals had 
been trained through the APP, and nearly 4 percent had participated in the Cahn Fellows

9 Nationally, the fraction female rose from 25 to 50 percent, while the fraction of non-white principals rose from 13 to 19 
percent (Fiore and Curtin (1997), Battle (2009)).
program.

Table 3 presents a broad-brush description of the relationship between principal 
characteristics and the characteristics of students in the schools in which they work. These tabulations 
are significant for the empirical analysis presented below since, as noted above, estimates of the 
performance impacts of principal characteristics will be biased if they do not account for school and 
student sorting. The table is based on principal-school observations pooled across all of the school 
years for which we have matched student-school data (1998-99 through 2006-07). The table reveals a
strong relationship between principal experience and the fraction of non-white students and an
especially strong relationship between principal experience and average (contemporaneous) test 
scores. If this relationship reflects students sorting based on factors that are unobserved in our data, 
we will over-estimate the return to principal experience. On the other hand, principal experience does 
not appear to be strongly related to other measures of school advantage such as the fraction of 
students receiving free or reduced-price lunch or the fraction who are English language learners. This 
could be because sorting is strongest with respect to test scores or because principal experience has a 
causal impact on test scores. 

It is interesting to consider the processes that might give rise to the relationship between 
principal experience and student background. For example, suppose principals are, on average, less 
likely to leave advantaged schools compared with disadvantaged schools. Even if job candidates for 
principals were randomly assigned among the schools with principal openings, this process would 
eventually generate a positive correlation between principal experience and student background. 

There might also be a tendency for principals to move up a career ladder, switching to schools serving 
more advantaged students as they acquire more experience. This would reinforce the effects of any 
differential separation behavior. 

Both of these tendencies can be seen in Table 4, which describes the schools from which 
principals separate and, if a transfer occurs, the schools to which they move. Of around 1,200 
separations, only 12 percent involve a transfer to another elementary or middle school in NYC; the
remaining transitions (88 percent) are predominantly to retirement or a job outside the NYC Department
of Education. The numbers in the first column reflect the relationship between principal separation and
average student test scores: principal turnover is lower in schools with average test scores in the highest 
quartile (22 percent) than in schools with average test scores in the lowest quartile (30 percent). The 
numbers in the remaining columns show that there is a small tendency for principals who move within 
the school system to move to schools with higher test scores. Of principals moving to a new school in 
our sample, roughly 50 percent stay in the same quartile of average test scores, roughly 30 percent 
move to a higher quartile, and roughly 20 percent move to a lower quartile. In other words, principals 
leaving lower-SES schools are more likely to transfer to higher-SES schools (conditional on 
transferring to an in-sample school). 

Provided that school sorting is in relation to persistent school characteristics, we can obtain 
unbiased estimates of the relationship between principal characteristics and school performance by 
comparing school performance across principals working at the same school (i.e., using school fixed 
effects). With this strategy in mind, Table 5 characterizes experience differences associated with
principal transitions within a school. We group the “prior” principals into those that had less than four 
years of experience, between four and six years of experience, and seven or more years of experience. 
The “new” principals are grouped in the same way, except that we include an extra category for 
principals in their first year (i.e., zero experience). The table shows the distribution of the new 
principal’s experience conditional on the experience of the prior principal. The results indicate that, 
across all levels of experience of the prior principals, the vast majority of new principals are first-year 
principals (i.e., they have not worked as a principal in another school). This is noteworthy because if 
new principals were assigned to schools randomly, the distribution of the new principal’s experience 
would be the same irrespective of the prior principal’s experience. This is roughly what the patterns in 
Table 5 show, although this does not imply that new principals are chosen at random. 
Results from Regression Analysis

In this section we report our estimates of the relationship between principal characteristics and school 
performance. Table 6 reports our estimates of the relationship between principal characteristics and 
math and English test scores. The sample pools all students in grades 3-8. For both math and
English scores, we present estimates from two models. The first model includes extensive student 
characteristics, school characteristics, and school zip code fixed effects. These controls are included to 
mitigate bias from the possible sorting of high-achieving students to principals with certain 
characteristics. The second model replaces zip code fixed effects with school fixed effects. These 
estimates identify the relationship between principal characteristics and school performance by 
comparing school performance among different principals at the same school, thereby controlling for 
any time-invariant but unobservable school factors, including average unobservable student 
characteristics. 

The estimates reported in the first column of Table 6 reveal four aspects of the relationship 
between principal characteristics and math test scores. First, the estimates suggest that math test scores 
are unrelated to a principal’s educational credentials, as measured by the selectivity of the schools from 
which the principals received their MA and BA degrees. 10 The BA and MA estimates have different 
signs and neither is statistically distinguishable from zero. Second, the estimates provide mixed 
evidence regarding the relationship between math test scores and principal training and professional 
development. Principals that become Cahn Fellows (but have not yet been selected and gone through 
the program) are associated with high-performing schools, which is not surprising given that they are 
selected in part due to a record of good performance. Post-program Cahn Fellows are associated with 
even higher school performance. This differential may reflect the causal effect of this program on 
school performance, although these estimates are still based on across-school comparisons. APP 

10 Since all principals have a master’s degree, we do not attempt to estimate the relationship between these and school
performance. 
graduates are associated with low school performance. Given that these principals are selected to work
in struggling schools, one would be concerned that our estimates are biased due to differences at the 
school level for which we do not control. We return to these points below. 

These estimates would also suggest that math test scores are higher when the principal has more 
experience as either a teacher or an assistant principal at the school where he/she becomes principal. 
These estimates are, however, small and on the margins of statistical significance. 11 Finally, the
estimates suggest that math test scores are increasing in principal experience at the current school. 
There is a monotonic relationship between years as a principal and math test scores such that, other 
things equal, principals with three years of experience are associated with an increase in math scores of 
0.04 standard deviations, while principals with five or more years of experience are associated with 
math scores 0.06 standard deviations higher than principals in their first year. As discussed below, this 
experience differential is around twice as large as that reported by Branch, Hanushek, and Rivkin (2008)
using similar models estimated using Texas data. 

The addition of school fixed effects has a noticeable effect on our estimates, suggesting that 
some estimates from the specification with zip code fixed effects may be biased due to the non-random 
matching of principals and schools. First, the addition of school fixed effects eliminates the significant 
relationship between math scores and prior experience as a teacher or assistant principal at the current 
school. It also substantially weakens the relationship between math scores and the APP, supporting the 
view that these estimates were likely biased by placement procedures. The coefficient on Aspiring 
Principals Program graduates shrinks from -0.081 to -0.045 standard deviations, consistent with their
placement in difficult schools, though it is still statistically significant. Note that the addition of school 
fixed effects does not ensure that bias is removed completely, since APP graduates may be placed in 

11 We also estimated specifications that included total experience as a teacher or assistant principal instead of experience at
the principal’s current school. The effects of overall experience were estimated to be even smaller than the results reported
in Table 6 and were not statistically significant. We focus on prior experience at the current school for the remainder of our
analysis. 
schools in which performance is on a declining trend. We investigate this further in the section
entitled “Additional Analyses of Prior Experience and the Aspiring Principals Program.”

Turning to the coefficients on the Cahn Fellows program, we estimate the difference in school 
performance between Cahn Fellows and other principals who worked in their school to be close to zero 
prior to their selection into the program. In other words, we cannot reject the hypothesis that Cahn
Fellows come from good schools but were not more effective than other principals prior to their 
selection. However, there is a marginally significant (p-value = 0.16) positive effect (0.038 standard 
deviations) on Cahn Fellows in years after selection into the program. 

The addition of school fixed effects has little impact on the principal experience estimates, 
which continue to suggest that experienced principals are more effective than principals in their first 
year. The estimates suggest that experience beyond four years has only very small effects on school 
performance, in line with our identification assumption. The magnitude of these estimates also 
suggests that the impact on math test scores for experienced principals who participate in the Cahn 
Fellows program is roughly the same as the effect of a first-year principal acquiring five years of 
experience. 

Overall, the math test score results in Table 6 suggest there is no significant relationship, on 
average, between school performance and pre-principal experience or the selectivity of a principal’s 
undergraduate and graduate institution. We find a positive relationship between school performance 
and experience as a principal, while estimates of the impacts of principal training and professional 
development are mixed. Estimates of the relationship between principal characteristics and English test scores are
consistent with these conclusions, although the estimated experience profile is slightly flatter. One
explanation is that English test scores are less sensitive to principals’ actions than math test scores. This 
has been found elsewhere with respect to the impact of teachers (Rockoff 2004; Rivkin, Hanushek, and 
Kain 2005; Kane, Rockoff, and Staiger 2008) and is consistent with the estimates associated with other 
principal characteristics, which are typically smaller for English than math. 
Table 7 shows results from math achievement models estimated separately for elementary and
middle schools. Splitting the sample in this way is interesting for two reasons. First, if test scores are 
more malleable in earlier grades, then one might expect principals to “matter” more in elementary 
schools. There is some empirical support for this view. For example, the association between 
principal experience and math scores is steeper in elementary schools than it is in the full sample, and 
the coefficients are actually larger in the specification that controls for school fixed effects. In 
contrast, the estimated experience profile for middle schools is much flatter. In the specification that 
controls for school fixed effects, experience has very little association with math scores. 

A second reason for looking at elementary and middle schools separately is that almost all
students in New York City switch schools between elementary and middle school. Thus, most middle
school students have at least one prior score that is not influenced by the current principal. This makes it
possible to compare the estimates from middle school models that do and do not control for pre-middle
school test scores. The degree to which estimates from these two models are similar is indirectly
informative about the validity of the identification assumption needed for the estimates in Table 6 to be
causal (i.e., whether principal characteristics are correlated with unobservable determinants of student
outcomes). Reassuringly, the estimates are not very sensitive to the inclusion of prior test scores. 12

Table 8 presents our estimates of the relationship between measures of student behaviors and 
principal characteristics. We focus on student absences and student suspensions, the two behavior 
measures in our dataset. Estimates of the relationship between student absences and principal 
characteristics are consistent with estimates of the relationship between student test scores and principal 
characteristics. In the models with zip code fixed effects and school fixed effects, there is no evidence 
of a relationship between student absence and principal education or pre-principal experience, and 
mixed evidence on the relationship between student absence and principal training and professional 
development programs. Specifically, absences show no significant relationship with having a school run 

12 English achievement models (available from the authors) point to similar conclusions.
by one of the APP principals, but they are estimated to fall by 0.29 absences per child (p-value = 0.15)
in the years after a principal enters the Cahn Fellows program. There is clear evidence that student 
absences are decreasing in principal experience. The addition of school fixed effects flattens the 
estimated experience profile, consistent with principals gradually learning how to control student 
absence, perhaps because it takes time to put in place procedures (e.g., assigning personnel to find 
truant students) to reduce absences. 

Estimates of the relationship between student suspensions and principal characteristics suggest 
that suspensions fall monotonically with principal experience. Additionally, suspensions are estimated 
to be higher with APP principals and lower with principals that have more pre-principal experience as a 
teacher/assistant principal in their school (though both of these associations may reflect changes in 
student behavior or changes in reporting practices). Overall, as revealed by the adjusted R-squared 
values, these models explain only a small fraction of the variation in student suspensions in the sample. 

Next, we present results on teacher absences and turnover (Table 9). For ease of exposition, we 
restrict attention to specifications with school fixed effects. Of all the characteristics we examine, only 
pre-principal experience as an assistant principal in the current school is significantly related to teacher 
absences. Since principals that previously worked as an AP in their current school did so for 4.6 years 
on average, the coefficient of 0.033 implies that teacher absences are about 0.15 higher under principals 
who were previously an AP in their schools. 
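
The implied difference is simply the product of the two figures quoted above. The short sketch below reproduces that arithmetic; the variable names are ours and purely illustrative.

```python
# Both values are quoted in the text above.
avg_years_as_ap = 4.6   # average years as AP in the current school
coef_per_year = 0.033   # estimated coefficient per year of AP experience

# Implied difference in teacher absences for a principal with average AP tenure.
implied_difference = avg_years_as_ap * coef_per_year
print(round(implied_difference, 2))
```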

In the turnover regressions, we see a negative correlation between teacher turnover and principal 
experience. These estimates suggest that turnover rates are approximately 1 percentage point lower 
under a principal with five or more years of experience relative to a new principal. Again, the 
normative implications of these findings are unclear. Teacher turnover can have negative consequences
for student achievement, since experienced teachers are, on average, more effective than new teachers 
(Rockoff 2004; Rivkin, Hanushek, and Kain 2005). However, experience is not the only determinant of 
teacher quality, and new principals may be more likely to force out teachers they believe are ineffective.
Equally plausible, however, is that new principals make changes that some teachers view negatively,
resulting in voluntary departures. 

Additional Analyses of Prior Experience and the Aspiring Principals Program

One surprising aspect of our baseline analysis is the lack of any strong systematic relationship between
school performance and pre-principal experience as a teacher or assistant principal. As mentioned 
above, the fraction of principals who previously worked as a teacher or assistant principal in their 
school was roughly 10 and 20 percent, respectively. However, it is important to recognize that our 
specification restricted the coefficient on pre-principal experience to be the same across all principals. A
more reasonable hypothesis is that prior experience might be beneficial for new principals, but might 
become less important as principals spend more time on the job. 

To investigate this possibility, we categorized principals as being either inexperienced (in their 
first two years) or experienced (three or more years), and interacted this variable with measures of pre- 
principal experience. We also controlled for school fixed effects. Among inexperienced principals, 
experience as an assistant principal in the current school is positively correlated with school 
performance (test scores, absences and suspensions), although experience as a teacher is not. Among 
experienced principals, neither type of pre-principal experience appears useful. This suggests that hiring 
an assistant principal to fill a principal vacancy may be helpful, particularly in schools where principal 
turnover is high. An alternative interpretation, however, is that if an assistant principal takes over as 
principal in the same school, this is an indicator of stable and well-functioning policies that make 
operation by a new principal easier. 
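
The interacted specification amounts to replacing each pre-principal experience measure with two regressors, one that is active only for inexperienced principals and one that is active only for experienced principals. The sketch below shows one way such regressors could be constructed; it is illustrative rather than our actual code, the function and variable names are hypothetical, and it assumes that experience is recorded in completed years, so that values of 0 and 1 mark the first two years on the job.

```python
def interaction_regressors(years_as_principal, ap_years_here, teacher_years_here):
    """Split pre-principal experience by whether the principal is inexperienced.

    Assumption: years_as_principal counts completed years, so 0 and 1
    correspond to a principal's first two years.
    """
    inexperienced = 1 if years_as_principal < 2 else 0
    experienced = 1 - inexperienced
    return {
        "ap_x_inexperienced": ap_years_here * inexperienced,
        "ap_x_experienced": ap_years_here * experienced,
        "teacher_x_inexperienced": teacher_years_here * inexperienced,
        "teacher_x_experienced": teacher_years_here * experienced,
    }


# A first-year principal and a veteran, each with four years as AP in the school.
first_year = interaction_regressors(0, 4, 0)
veteran = interaction_regressors(6, 4, 0)
```

With this coding, the coefficient on each interaction is identified only from principals in the corresponding experience group, which is what allows the association to differ between the two groups.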

We conclude our analysis of principal characteristics by taking a closer look at the schools that 
recruit their new principal from the APP. As mentioned above, in evaluating the performance of these 

13 When school performance is proxied with teacher-based measures, there is no systematic relationship between school
performance and pre-principal experience for either inexperienced or experienced principals (results available upon 
request). 
principals, one might worry that APP graduates are placed in schools that are not only low-performing
but on a declining trend. We therefore estimate specifications that allow the coefficient on having an 
APP principal to vary in the two years before and the three years after the APP graduate is hired. For 
purposes of comparison, we include the same indicators for schools that transition to a new principal 
but do not hire their new principal from the APP. We limit our sample to schools that hired a new 
principal between 2004-05 and 2006-07, the years in which the APP graduates began to be hired. 
These results are shown in Table 11. 
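
To make the structure of this specification concrete, the sketch below builds one indicator per year relative to the hire, separately for schools that hired an APP graduate and schools that hired another new principal. It is a hedged illustration, not our estimation code. The function name and indicator labels are hypothetical, school years are assumed to be coded by a single integer (e.g., 2004 for 2004-05), and "three years after" is assumed to mean the hire year and the two years that follow.

```python
def event_time_indicators(school_year, hire_year, hired_app_graduate):
    """Return 0/1 indicators for years -2..+2 relative to a new principal hire.

    Indicators are defined separately for schools hiring an APP graduate
    ("app") and schools hiring any other new principal ("other").
    Years outside the window leave every indicator at zero.
    """
    group = "app" if hired_app_graduate else "other"
    relative_year = school_year - hire_year
    indicators = {}
    for g in ("app", "other"):
        for t in range(-2, 3):
            # Label format is illustrative, e.g. "app_t-2", "other_t+1".
            indicators[f"{g}_t{t:+d}"] = int(g == group and t == relative_year)
    return indicators


# A school that hired an APP graduate in 2004-05, observed one year later.
example = event_time_indicators(2005, 2004, True)
```

Comparing the coefficients on the pre-hire indicators across the two groups is what reveals whether schools receiving APP graduates were already on a different trajectory before the transition.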

Figure 1 provides a graphical illustration of these estimates. Specifically, it shows predicted test 
score trajectories at schools that transition to an APP graduate and schools that transition to a principal 
trained through a more traditional route. The predictions assume that in both types of schools the old 
principal had at least five years of experience and the new principal had none. The drop in performance 
associated with the principal transition is not surprising: the new principals have no experience and we 
have shown that less experience is associated with lower school performance. More interesting are the 
trends leading into and away from this transition and the differences in these trends across schools that 
do and do not hire APP graduates. In particular, while both types of schools tend to experience a 
principal transition after a test score decline, graduates of the Aspiring Principals Program tend to take 
over schools that experienced a noticeably greater pre-placement decline. This suggests that APP 
graduates are placed in schools that are already on a downward trajectory. The graph also suggests that 
these relative performance trends continue beyond the principal transition: the performance drop 
associated with the transition is larger at the schools hiring an APP graduate, and these relative 
performance trends are not reversed until three years later, and then only for English. 14 These 
qualitative results are insensitive to whether we examine trajectories with respect to overall experience 
as a principal or experience as a principal within a school (which, for most principals, are the same 

14 Since the first cohort of Aspiring Principals entered school in 2004-2005, we cannot measure impacts beyond three years. 
Also, these estimates may be confounded by changes in the quality of principals produced in different program cohorts. For 
example, if the initial cohort of Aspiring Principals were better than those which followed, this might drive our finding that 
school performance improves somewhat in their third year. 
thing).

It is worth noting that our regression specification attributes variation in school performance to 
the current school principal. However, like any new manager, new principals are likely to be 
constrained by prior policy decisions, e.g., most of their teachers were hired by someone else. This 
raises two interpretation issues. First, it is unclear whether one would expect APP graduates to be able 
to make substantive changes to their schools in the time horizon we examine. Second, if APP graduates 
are more likely than other principals to take over schools whose previous principal made very poor 
policy decisions, we may reasonably expect their schools to underperform in the short run relative to 
other schools with new principals. One could fully address this problem with exogenous variation in 
the assignment of APP graduates among a set of schools hiring a new principal, but this is not available. 

The results of our analysis of APP principals are slightly different than those obtained by 
Corcoran, Schwartz, and Weinstein (2009). Specifically, they find that performance at schools run by 
APP principals improves within a year of the APP principal taking over - more quickly than our results 
suggest. There are several differences between the two studies that could account for these different 
findings. First, the two sets of estimates are based on different samples. Corcoran, Schwartz, and 
Weinstein (2009) analyze a subset of schools that comprise a “balanced panel” in order to conduct a 
longitudinal analysis of the program’s impacts. Specifically, they focus on members of the first two
cohorts of APP principals that remained in their school for three or more consecutive years, serving as a 
principal throughout this period, and a set of comparison principals selected using similar criteria. 15 
Our sample includes all members of the first three cohorts of APP graduates hired by elementary and 
middle schools. In addition, their study period covers the 2002-03 through 2007-08 school years whereas
our data end in the school year 2006-07. Second, the two sets of estimates use different sets of
controls. In particular, Corcoran et al. (2009) use school-level data and do not control for any of the

15 The first two cohorts of APP principals analyzed in Corcoran, Schwartz, and Weinstein (2009) became principals in the 
2004-05 or 2005-06 school years. About 27 percent (32 out of 120) of APP-trained personnel who were placed as principals 
are excluded from the analysis (author’s calculations based on Table 1 of Corcoran, Schwartz, and Weinstein (2009)).
student-level variables controlled for here.

While it is tempting to focus on these differences, our studies have much in common. In 
particular, both studies show that APP principals are assigned to schools in which performance is below 
average and trending downwards, and both studies find evidence that schools run by APP principals do, 
eventually, show relative performance improvements. In our view, given the difficulties involved in 
estimating the causal effects of this program, it is not surprising that differences in sample and method 
could generate differences in the timing and strength of these improvements. Additionally, both studies 
estimate these effects imprecisely, due to the small numbers of principals trained in this program (e.g., 
there are only 88 APP-trained principals in the data used by Corcoran, Schwartz, and Weinstein (2009)).

Discussion of Results 

Our analysis suggests three main findings. First, we find little evidence of any relationship between
school performance and principal education or pre-principal work experience. One exception is that, for
inexperienced principals, we find a positive effect of having worked as an assistant principal in the
same school. Second, we find a positive relationship between principal experience and school
performance, particularly for math test scores and student absences. The experience profile is especially
steep over the first few years of principal experience. Third, we find mixed evidence on the relationship
between principal training and professional development programs and school performance, with the
caveat that the non-random process by which program participants are hired by or selected from schools
makes it hard to identify their effects. Our estimates suggest that when an experienced principal
becomes a Cahn Fellow, school performance improves. On the other hand, our estimates suggest that
when a school is assigned a principal from the APP, relative school performance does not improve (and
may even drop) in the short run, but may improve in the longer run. 16 In addition to the problems

16 Note that there were significant citywide gains in math and English achievement during our study period, particularly in 
the NYC schools serving poor students. Thus, students in schools led by APP principals were likely to have made absolute 
gains in achievement, but our findings suggest they were smaller than those experienced in schools led by new non- APP 
principals. 
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caused by non-random selection and assignment, our data on APP graduates is relatively limited — 
covering only three years and three cohorts — suggesting further study is necessary to gauge the merits 
of this training program for new principals. 

The study most comparable to ours is Branch et al. (2008), which also uses panel data to study 
the relationship between school performance and principal characteristics. In the most comparable 
specification, in which they examine the relationship between principal experience and math 
achievement among grade 3-8 students, they estimate a return to experience around one half of the size 
of that estimated here. In particular, they estimate that relative to a first-year principal, a principal with 
six or more years of experience is associated with a 0.025 or a 0.017 increase in test scores, depending
on whether the specification includes school fixed effects. Our estimates suggest effects of 0.061 and
0.039. 
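
The "around one half" comparison can be checked with simple arithmetic. The sketch below uses only the four point estimates quoted above and assumes that the two pairs are listed in the same order in both sentences; the dictionary keys are our own labels, not terms from either paper.

```python
# Point estimates quoted in the text, in test-score standard deviations.
# Assumption: the first Branch et al. figure pairs with our first figure,
# and the second with our second.
estimates = {
    "first quoted pair": {"branch_et_al": 0.025, "this_paper": 0.061},
    "second quoted pair": {"branch_et_al": 0.017, "this_paper": 0.039},
}

# Ratio of the Branch et al. estimate to ours, rounded to two decimals.
ratios = {
    label: round(pair["branch_et_al"] / pair["this_paper"], 2)
    for label, pair in estimates.items()
}

for label, value in ratios.items():
    print(f"{label}: Branch et al. / this paper = {value}")
```

Both ratios fall a little below one half, which is consistent with the statement in the text.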

There are several reasons why our estimates might differ from theirs. First, the returns to 
experience might be higher in NYC. This would be the case if the NYC environment were one in which 
principal learning is more important. Second, ability biases might be larger in NYC (recall that both 
sets of estimates will be affected by ability biases). Ability biases would be larger in NYC if less able 
principals were more likely to quit or be terminated early in their career. 

To get a handle on this last possibility, we estimated models that include principal-by-school
fixed effects. If principals spend their entire careers in one school, these will identify the same
parameter as that identified by the school fixed effects models (provided both are free of school and
student sorting biases). If principals move schools, the two approaches will estimate slightly different
parameters, although the difference between them should not be large.18 As seen in Table 12, replacing
school fixed effects with school-by-principal fixed effects generates slightly smaller point estimates
(especially for English, where the experience effects become insignificant), although a statistical test
could not distinguish between the two. This suggests that the NYC-Texas differences are unlikely to be
driven by ability biases.

17 One simple explanation is that a standard deviation of achievement among Texas students represents twice as much
“true learning” as a standard deviation in NYC. This is highly unlikely; NAEP data show that variance in achievement
within NYC and within Texas are both close to the national variance in achievement.

18 The principal-by-school fixed effects models will identify the return to an additional year of experience in the same
school, while the school fixed effects models will identify the average return to an additional year of experience (in the same
school or in general). If much of the return to experience is school specific, then the principal-school fixed effects estimates
may be higher. However, in practice this is unlikely to be important because few principals work in multiple schools.

A third explanation for the steeper NYC experience profile is that the NYC estimates suffer 
from greater student and school sorting biases. For example, there may be more scope for successful 
NYC principals to sort into schools that are high performing in ways that cannot be controlled for using 
student characteristics. This explanation is difficult to address without exogenous variation in principal 
experience. In the next section, we describe a natural experiment that can help us assess whether the 
NYC estimates are driven by selection bias. 

Do Inexperienced Principals Hurt School Performance: A Natural Experiment 

An important finding to emerge from our analysis is the positive impact of principal experience, 
particularly over the first few years of principals’ careers. Since this implies that new, inexperienced 
principals will, on average, hurt school performance, it has at least two implications. First, it implies 
that policies that lengthen principals’ careers will, on average, improve school performance, since there 
will be fewer first-year principals. Second, it implies that a positive correlation between principal 
experience and student background may exacerbate inequality within the NYC education system. 

Since this finding has important implications, we would like to be sure that it is not biased by
student or school sorting. While our estimates are based on models that include school fixed effects,
ruling out sorting on persistent school characteristics, they could be biased by sorting on transitory
school characteristics or on student characteristics. For example, if a school is renovated halfway
through the sample period, and if this attracts both higher performing students and a more experienced
principal, we may falsely attribute this student-induced improvement to the more experienced principal.
More generally, more-experienced principals may be attractive to parents of relatively advantaged
students, perhaps because they suggest a greater degree of school stability. If our data do not capture
these characteristics, we will falsely attribute differences in student performance driven by these
characteristics to the actions of the experienced school principal. This is a problem faced by all existing
studies on the impact of principals on student outcomes.

In ongoing work, we examine principal experience using a natural experiment. The “Principal 
Retirement Incentive,” introduced by NYC in 1991, aimed to help alleviate a budget crisis by inducing 
early principal retirement. Three aspects of this scheme make it an especially useful tool for identifying 
the effects of principal experience. First, the scheme induced many retirements: around 250 of NYC’s 
1,000 principals retired in 1991. Second, the scheme’s eligibility rules provide a research design for 
examining principal retirement. In particular, principals wishing to take advantage of the scheme had to 
have accrued certain levels of experience or been of a certain age in 1991. We can therefore use a 
regression discontinuity design to take advantage of these rules: comparing outcomes at schools with a 
principal that narrowly qualified for the scheme and at schools with a principal that narrowly failed to 
qualify for the scheme. Third, the scheme took schools and parents almost completely by surprise: it 
was first proposed in April of 1991 and all of the retirements had occurred by September of 1991. 

To see the potential of this experiment more clearly, consider a simplified version of the scheme 
in which eligibility is based on a single index and there is complete compliance. That is, suppose that in 
1991, all schools with principals below age 55 kept their principal while all schools with principals 
above age 55 lost theirs. Suppose also that the schools that lost principals replaced them with a random 
draw from a pool of potential new principals. The effects of principal experience could then be 
identified via a comparison of next year’s outcomes at control schools (which kept their experienced
principal) and treatment schools (that replaced their experienced principal with an inexperienced one). 
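
The logic of this simplified scheme can be illustrated with a small simulation. The sketch below is not the analysis planned for the paper: the data are synthetic, and the effect size, noise level, age range, bandwidth and function names are all hypothetical choices made for illustration. It treats principals aged 55 or over as the group that leaves, fits a straight line to outcomes on each side of the age cutoff, and compares the two fitted values at the cutoff.

```python
import random

def simulate_schools(n_schools, true_effect, seed=0):
    """Simulate the simplified scheme: principals aged 55 or over leave.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    schools = []
    for _ in range(n_schools):
        age = rng.uniform(45.0, 65.0)      # principal's age in 1991
        replaced = age >= 55.0             # complete compliance with the rule
        trend = 0.01 * (age - 55.0)        # smooth relation between age and outcomes
        noise = rng.gauss(0.0, 0.1)
        outcome = trend + (true_effect if replaced else 0.0) + noise
        schools.append((age, outcome))
    return schools

def predicted_at_cutoff(points, cutoff):
    """Fit a least-squares line to (age, outcome) pairs and evaluate it at the cutoff."""
    xs = [age - cutoff for age, _ in points]
    ys = [y for _, y in points]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar           # intercept, i.e. fitted value at the cutoff

def rd_estimate(schools, cutoff=55.0, bandwidth=2.0):
    """Difference in fitted outcomes just above and just below the cutoff."""
    above = [(a, y) for a, y in schools if cutoff <= a < cutoff + bandwidth]
    below = [(a, y) for a, y in schools if cutoff - bandwidth <= a < cutoff]
    return predicted_at_cutoff(above, cutoff) - predicted_at_cutoff(below, cutoff)

schools = simulate_schools(n_schools=40000, true_effect=-0.05)
estimate = rd_estimate(schools)
print(f"estimated jump at the cutoff: {estimate:.3f}")
```

Fitting a line on each side, rather than comparing raw means, keeps the smooth age trend from being mistaken for an effect of losing an experienced principal.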

As with the school fixed effects estimates, this comparison will identify the combined effect 
of the difference in experience and the differences in observed and unobserved characteristics 
between the two groups. The quasi-experimental nature of this variation will, however, allow us to
interpret these estimates as causal effects of principal experience under weaker assumptions than
were used in the school fixed effects analysis. Rather than rely on school fixed effects and student
controls to capture school and student sorting, we will rely on the assumption that the scheme was 
unanticipated and that there would, in the absence of the scheme, have been no difference in the 
performance of schools led by principals that were narrowly eligible and narrowly ineligible for the 
scheme. 

Conclusion 

This paper provides one of the first sets of estimates of the relationship between school performance 
and school principal characteristics. The conventional wisdom is that principals exert an important 
influence on school performance, but this view is based mainly on anecdotal evidence. We focus on
three sets of characteristics: principals’ education and pre-principal experience; principal
experience; and principal participation in principal training programs, now popular in many school
districts.

We find little evidence of any relationship between school performance and principal education 
and pre-principal characteristics, although we do find some evidence that experience as an assistant 
principal in their current school is associated with higher performance of schools led by inexperienced 
principals. We find mixed evidence in regard to principal training and professional development 
programs, although program rules are such that it is hard to isolate the effects of these programs. Our 
clearest finding is that schools perform better when they are led by experienced principals. The 
experience profile is especially steep over the first few years. 

The finding of positive experience effects accords with common sense: workers performing 
most tasks will become more productive with experience at the task, especially when the task is as 
demanding as running a school. It does, however, have important policy implications. First, it alerts
district administrators to the potential costs of having experienced principals leave their jobs or,
equivalently, informs them of the benefits associated with retaining experienced principals. Second, it
alerts district administrators to the distributional consequences that follow from higher rates of principal
turnover in disadvantaged schools. Given the importance of these implications, and since our estimates
are larger than those found in the related literature, we would like to be sure that they are not biased by 
the sorting of experienced principals to good schools and good students. In the analysis presented here 
we have taken a number of steps to ensure these biases are minimized. In ongoing work we seek to 
exploit a natural experiment to obtain quasi-experimental estimates of these experience effects. 
Data Appendix

The data used in the paper come from several sources: personnel files on principals, personnel files on 
teachers, and student achievement and enrollment files. These data are formatted separately and then 
merged together by school and school year. Each employee in the NYC school system has a service 
history file, which traces the location and type of work they perform over time. These files are in spell 
format, e.g., one line in an individual’s file might state that he/she worked as an assistant principal in a 
particular school from September 1, 1995 through June 30, 1998. We use data on the complete service 
history for every person who worked as a NYC principal between 1982 and 2007. Because these files
contain full employment histories, the earliest spells date from the 1940s (when several of these
principals were teachers in NYC) and extend through the school year 2006-2007, by which time many
principals had left NYC or moved into a non-principal position (e.g., district administrator). We recode
these spell data to determine principals’ activities in the fall and spring of every school year during their
employment history. We then used these data to measure principal experience in teaching, as an
assistant principal, and as a principal, both overall and within their current school. For individuals that 
retired before 1992, some of the employment information is truncated in 1982. For this reason, our 
descriptive statistics are limited to principals employed from 1992 onwards, but our information is 
complete for all principals working in the period for which we analyze school performance. 
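
The conversion from spell format to fall and spring observations can be sketched as follows. This is an illustrative sketch only, not the procedure actually used: the spell records, role labels, school identifiers and the two reference dates (October 1 for fall, March 1 for spring) are all hypothetical.

```python
from datetime import date

# Hypothetical spell records: (role, school_id, start_date, end_date).
spells = [
    ("teacher", "S001", date(1990, 9, 1), date(1995, 6, 30)),
    ("assistant_principal", "S001", date(1995, 9, 1), date(1998, 6, 30)),
    ("principal", "S002", date(1998, 9, 1), date(2001, 6, 30)),
]

def activity_on(spells, day):
    """Return (role, school_id) for the spell covering `day`, or None."""
    for role, school, start, end in spells:
        if start <= day <= end:
            return role, school
    return None

def school_year_panel(spells, first_fall_year, last_fall_year):
    """One row per school year, with the activity observed in fall and in spring."""
    panel = []
    for year in range(first_fall_year, last_fall_year + 1):
        panel.append({
            "school_year": f"{year}-{str(year + 1)[-2:]}",
            "fall": activity_on(spells, date(year, 10, 1)),       # assumed fall date
            "spring": activity_on(spells, date(year + 1, 3, 1)),  # assumed spring date
        })
    return panel

def years_in_role(panel, role):
    """Count school years in which the fall activity matches `role`."""
    return sum(1 for row in panel if row["fall"] is not None and row["fall"][0] == role)

panel = school_year_panel(spells, 1990, 2000)
for role in ("teacher", "assistant_principal", "principal"):
    print(role, years_in_role(panel, role))
```

Experience within the current school could be counted the same way by also matching on the school identifier.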

In addition to these service history files, we were provided with data on principals’ tertiary 
education, including institutions from which degrees were received, and training programs the 
principal attended. These data are only available for the more recent period on which we focus our 
analysis. We use data on degree attainment to create variables for the selectivity of the institution 
from which the principal received their BA and MA degrees. 

Student data contain information on demographic background, attendance, suspensions from 
school, test performance, eligibility for free lunch, special education and bilingual education. Teacher 
personnel files contain information on teacher characteristics (e.g., education, experience, certification), 
employment history (from which we derive measures of turnover), and the timing and type of absences
taken during the year. We focus on absences that are coded as “Self-treated Sickness” and “Personal 
Days,” which are arguably under teachers’ control. Other types of absences include jury duty, military 
service, funeral attendance, and medically certified illnesses. These data are described in greater detail 
in Kane, Rockoff and Staiger (2008) and Herrmann and Rockoff (2009). 
Figure 1: School Performance During the Transition to New Principal

[Figure not reproduced. Plotted series: Math Scores, APP Principal Transitions; English Scores, APP Principal Transitions; Math Scores, Non-APP Transitions; English Scores, Non-APP Transitions.]

Note: This figure plots the expected change in student test score performance as a school transitions from a principal
with 5 or more years of experience to a new principal. The estimates used to create the figure are those shown in Table
11 and an estimate (from the same regressions) of the impact of a principal having 5 or more years of experience (0.042
for math scores and 0.026 for English scores) relative to no experience. We estimate transition effects separately for
schools that transitioned to the new principal from the Aspiring Principals Program (APP) and those that transitioned to
a principal trained through a more traditional route during the school years 2004-05 through 2006-07.






New Principals 



Table 1: Principal Characteristics in New York City, 1987-2007 









Table 2: Training and Professional Development Programs, 2002-2007

        Pre-Principal Training                                    Professional Development
        Aspiring      New Leaders for   Tomorrows     Bank St     Cahn
        Principals    New Schools       Principals    Academy     Fellows
2002    0%            0%                0%            0.08%       0%
2003    0%            0%                0%            0.08%       0%
2004    0%            0%                0%            0.16%       0.72%
2005    4.77%         0.73%             0%            0.22%       1.91%
2006    8.03%         1.27%             0.28%         0.28%       3.10%
2007    11.43%        1.72%             0.55%         0.28%       3.86%

Notes: Cells describe percentage of principals in each academic year that had, at some time,
participated in each of these training programs.





Table 3: School Characteristics by Principal Experience

Years of               Number of    % Receiving   % Receiving           % English           % Non-   Average Prior   Average Prior
Principal Experience   Principals   Free Lunch    Reduced-price Lunch   Language Learners   White    English Score   Math Score
0                      1301         0.74          0.09                  0.11                0.90     0.02            0.01
1                      1175         0.74          0.09                  0.11                0.90     0.04            0.04
2                      1061         0.74          0.09                  0.11                0.89     0.06            0.07
3                      814          0.74          0.09                  0.11                0.89     0.09            0.11
4                      644          0.73          0.10                  0.10                0.89     0.09            0.11
5-6                    901          0.72          0.11                  0.10                0.89     0.09            0.12
7-9                    916          0.74          0.08                  0.10                0.88     0.14            0.16
10+                    999          0.76          0.08                  0.10                0.86     0.14            0.18

Notes: Table based on 7811 principal-year observations for 2076 principals in 1064 elementary and middle schools over the school years 1998-
99 to 2006-07. Prior English and math scores refer to prior test scores of students tested in the current year and are expressed as z-scores, with
a mean of zero and standard deviation of one within grade-level and year. There are fewer principal-year observations for these variables
(6941 for math, 6937 for English) because prior scores are not available in some years and for some schools. Also, schools serving only
special education students are omitted, since many of these students are not tested, hence this population's average score is above zero.
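
The standardization described in these notes (mean zero and standard deviation one within each grade level and year) can be sketched as below. The records, field names and scale scores are hypothetical, and the use of the population standard deviation is an assumption; the source does not say which formula was used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def standardize_within_cells(records):
    """Add a z-score to each record, computed within its (grade, year) cell."""
    cells = defaultdict(list)
    for rec in records:
        cells[(rec["grade"], rec["year"])].append(rec["score"])
    # Mean and population standard deviation for each grade-year cell.
    stats = {cell: (mean(vals), pstdev(vals)) for cell, vals in cells.items()}
    standardized = []
    for rec in records:
        cell_mean, cell_sd = stats[(rec["grade"], rec["year"])]
        # A cell with no variation gets z = 0 to avoid dividing by zero.
        z = 0.0 if cell_sd == 0 else (rec["score"] - cell_mean) / cell_sd
        standardized.append({**rec, "z": z})
    return standardized

# Hypothetical scale scores for two grade-year cells.
records = [
    {"grade": 3, "year": 2005, "score": 600.0},
    {"grade": 3, "year": 2005, "score": 650.0},
    {"grade": 3, "year": 2005, "score": 700.0},
    {"grade": 4, "year": 2005, "score": 640.0},
    {"grade": 4, "year": 2005, "score": 680.0},
]
standardized = standardize_within_cells(records)
for rec in standardized:
    print(rec["grade"], rec["year"], round(rec["z"], 3))
```

A school's average z-score can then differ from zero because the standardization is done over all tested students in the grade and year, not within the school.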







Table 4: School Characteristics and Principal Transitions across Schools

                  New School
Old School        N/A      SES-1    SES-2    SES-3    SES-4
SES-1 (29.79%)    81.6%    10.8%    4.4%     2.2%     1.0%
SES-2 (26.41%)    93.2%    0.7%     3.6%     1.4%     1.1%
SES-3 (22.18%)    92.8%    0.9%     2.6%     1.3%     2.6%
SES-4 (21.62%)    92.6%    0.9%     1.3%     1.3%     3.9%
Overall           88.5%    4.2%     3.2%     1.9%     2.3%

Notes: The table describes the characteristics of the schools that principals leave and the schools to which
they transfer (if a transfer is made). School SES is defined in terms of quartiles of the prior math score
distribution, where prior reading and math scores refer to prior test scores of students tested in the current year
and are expressed as z-scores, with a mean of zero and standard deviation of one within grade-level and year.
The data cover the school years 1998-99 to 2006-07. There are 1,246 principal separations involving 1,180
principals in 784 elementary and middle schools.










Table 5: Principal Experience and Transitions Within Schools

                                New Principal Experience
                                None           One to Three   Four to Six    Seven or More
Prior Principal Experience      (First Year)   Years Exp.     Years Exp.     Years Experience
All Experience Levels           84.0%          10.1%          3.3%           2.6%
One to Three Years (43.03%)     83.5%          12.1%          2.1%           2.3%
Four to Six Years (21.51%)      83.5%          10.2%          3.0%           3.4%
Seven or More Years (35.46%)    84.8%          7.7%           4.9%           2.6%

Notes: The table describes experience levels among outgoing and incoming principals when a principal
transition occurs within a school. The data cover the school years 1998-99 to 2006-07. There are
1097 such transitions involving 745 principals in 1043 elementary and middle schools.











Table 6: Student Achievement and Principal Credentials

                                                Math Test Scores      English Test Scores
Selectivity of BA School (Median SAT)           0.001     -0.003      0.003     -0.002
                                               (0.003)    (0.002)    (0.003)    (0.002)
Selectivity of MA School (Median SAT)          -0.002     -0.000     -0.004     -0.002
                                               (0.003)    (0.003)    (0.003)    (0.002)
Aspiring Principals Program Graduate           -0.062*    -0.036+    -0.064*    -0.035*
                                               (0.022)    (0.019)    (0.018)    (0.013)
Cahn Fellow (Pre-selection)                     0.110*     0.003      0.109*     0.016
                                               (0.028)    (0.024)    (0.026)    (0.015)
Cahn Fellow (Post-selection)                    0.189*     0.038      0.174*     0.039+
                                               (0.034)    (0.029)    (0.035)    (0.020)
Years as Assistant Principal (current school)   0.005*     0.000      0.004*    -0.001
                                               (0.002)    (0.001)    (0.002)    (0.001)
Years as Teacher (current school)               0.002+    -0.000      0.002*    -0.000
                                               (0.001)    (0.001)    (0.001)    (0.001)
1 Year as Principal                             0.009+     0.007*     0.002      0.001
                                               (0.005)    (0.004)    (0.004)    (0.003)
2 Years as Principal                            0.027*     0.023*     0.012*     0.009*
                                               (0.006)    (0.005)    (0.005)    (0.004)
3 Years as Principal                            0.038*     0.035*     0.017*     0.013*
                                               (0.007)    (0.006)    (0.007)    (0.005)
4 Years as Principal                            0.039*     0.037*     0.025*     0.020*
                                               (0.008)    (0.006)    (0.008)    (0.006)
5 or More Years as Principal                    0.061*     0.039*     0.048*     0.026*
                                               (0.009)    (0.006)    (0.009)    (0.005)
Joint Test of Principal Experience (p-value)    0.00       0.00       0.000      0.000
Zip Code Fixed Effects                          Y                     Y
School Fixed Effects                                       Y                     Y
Observations                                    3,690,658  3,690,658  3,367,302  3,367,302
Adjusted R2                                     0.33       0.34       0.32       0.34

Note: The unit of observation is a student-year. All regressions include year-grade fixed effects,
controls for students' grade level, ethnicity, gender, grade repetition, and participation in free lunch,
English language learner, and special education programs, and school-year cell averages of the student
level controls and class size. Coefficients on individual student characteristics are allowed to differ by
grade level. Standard errors are clustered by school.






Table 7: Impact of Student Test Controls in Middle School Sample

                                                Math Test Scores                 English Test Scores
                                                Elementary Middle     Middle     Elementary Middle     Middle
Selectivity of BA School (Median SAT)          -0.001     -0.003     -0.001     -0.000     -0.003      0.001
                                               (0.003)    (0.004)    (0.004)    (0.003)    (0.003)    (0.003)
Selectivity of MA School (Median SAT)          -0.000     -0.001     -0.000     -0.001     -0.002     -0.000
                                               (0.003)    (0.004)    (0.004)    (0.003)    (0.003)    (0.004)
Aspiring Principals Program Graduate           -0.019     -0.054*    -0.027     -0.011     -0.050*    -0.008
                                               (0.022)    (0.025)    (0.022)    (0.018)    (0.019)    (0.017)
Cahn Fellow (Pre-selection)                     0.011     -0.003      0.016      0.001      0.030      0.049
                                               (0.030)    (0.035)    (0.045)    (0.020)    (0.019)    (0.034)
Cahn Fellow (Post-selection)                    0.033      0.050      0.060      0.033      0.031      0.044
                                               (0.029)    (0.056)    (0.065)    (0.027)    (0.027)    (0.039)
Years as Assistant Principal (current school)   0.003+    -0.002     -0.002      0.004*    -0.004*    -0.003*
                                               (0.002)    (0.002)    (0.002)    (0.001)    (0.002)    (0.001)
Years as Teacher (current school)              -0.000     -0.000      0.001      0.000     -0.001      0.001
                                               (0.001)    (0.001)    (0.002)    (0.001)    (0.001)    (0.002)
1 Year as Principal                             0.011*     0.003      0.002      0.006     -0.004     -0.004
                                               (0.005)    (0.006)    (0.007)    (0.004)    (0.005)    (0.006)
2 Years as Principal                            0.029*     0.017*     0.018+     0.010*     0.007      0.009
                                               (0.006)    (0.008)    (0.009)    (0.005)    (0.006)    (0.007)
3 Years as Principal                            0.035*     0.036*     0.032*     0.009      0.019*     0.014
                                               (0.007)    (0.009)    (0.011)    (0.006)    (0.007)    (0.009)
4 Years as Principal                            0.041*     0.033*     0.032*     0.013*     0.032*     0.030*
                                               (0.008)    (0.009)    (0.011)    (0.007)    (0.009)    (0.010)
5 or More Years as Principal                    0.050*     0.027*     0.022*     0.033*     0.019*     0.015
                                               (0.008)    (0.010)    (0.011)    (0.006)    (0.009)    (0.010)
Joint Test of Principal Experience (p-value)    0.00       0.00       0.04       0.000      0.010      0.050
Student Test Controls                                                 Y                                Y
Observations                                    2,135,113  1,555,545  1,555,545  2,008,796  1,358,506  1,358,506
Adjusted R2                                     0.33       0.37       0.48       0.32       0.37       0.49

Notes: All specifications include the controls listed in the notes to Table 6. Third and sixth columns also include controls for
student's prior (i.e., elementary school) test score.






Table 8: Student Behavior and Principal Credentials

                                                Absences              Suspensions/100
Selectivity of BA School (Median SAT)           0.047      0.016      0.001      0.001
                                               (0.030)    (0.028)    (0.001)    (0.001)
Selectivity of MA School (Median SAT)           0.012     -0.005      0.000      0.000
                                               (0.032)    (0.028)    (0.001)    (0.001)
Aspiring Principals Program Graduate            0.116      0.095      0.026*     0.021*
                                               (0.186)    (0.174)    (0.007)    (0.008)
Cahn Fellow (Pre-selection)                    -0.744*    -0.058     -0.005     -0.007
                                               (0.210)    (0.151)    (0.004)    (0.008)
Cahn Fellow (Post-selection)                   -0.966*    -0.287     -0.002     -0.003
                                               (0.290)    (0.200)    (0.008)    (0.011)
Years as Assistant Principal (current school)  -0.018     -0.003     -0.001*    -0.001+
                                               (0.020)    (0.018)    (0.000)    (0.000)
Years as Teacher (current school)              -0.009     -0.003     -0.000*    -0.001+
                                               (0.011)    (0.009)    (0.000)    (0.000)
1 Year as Principal                            -0.093     -0.107*     0.000     -0.001
                                               (0.061)    (0.052)    (0.002)    (0.002)
2 Years as Principal                           -0.185*    -0.183*    -0.000     -0.002
                                               (0.072)    (0.062)    (0.002)    (0.002)
3 Years as Principal                           -0.143     -0.166*    -0.005*    -0.007*
                                               (0.090)    (0.073)    (0.002)    (0.002)
4 Years as Principal                           -0.291*    -0.264*    -0.004     -0.005+
                                               (0.094)    (0.078)    (0.002)    (0.003)
5 or More Years as Principal                   -0.411*    -0.240*    -0.005*    -0.004*
                                               (0.098)    (0.069)    (0.002)    (0.002)
Joint Test of Principal Experience (p-value)    0.000      0.010      0.020      0.050
Zip Code Fixed Effects                          Y                     Y
School Fixed Effects                                       Y                     Y
Observations                                    3,851,268  3,851,268  3,851,268  3,851,268
Adjusted R2                                     0.13       0.15       0.03       0.04

Note: The unit of observation is a student-year. All regressions include year-grade fixed effects,
controls for students' grade level, ethnicity, gender, grade repetition, and participation in free lunch,
English language learner, and special education programs, and school-year cell averages of the student
level controls and class size. Coefficients on individual student characteristics are allowed to differ by
grade level. Standard errors are clustered by school.






Table 9: Teacher Outcomes and Principal Credentials

                                                Total             Does Not          Does Not
                                                "Voluntary"       Return to         Return to
                                                Absences          NYC Next Year     School Next Year
Selectivity of BA School (Median SAT)          -0.025             0.001            -0.001
                                               (0.017)           (0.001)           (0.002)
Selectivity of MA School (Median SAT)           0.022            -0.001             0.000
                                               (0.018)           (0.001)           (0.002)
Aspiring Principals Academy                    -0.132             0.007             0.017+
                                               (0.102)           (0.005)           (0.010)
Cahn Fellow (Pre-selection)                    -0.142             0.007            -0.004
                                               (0.157)           (0.006)           (0.009)
Cahn Fellow (Post-selection)                   -0.234            -0.000            -0.016
                                               (0.207)           (0.008)           (0.011)
Years as Assistant Principal (current school)   0.033*           -0.000            -0.001
                                               (0.009)           (0.000)           (0.001)
Years as Teacher (current school)              -0.001             0.000            -0.000
                                               (0.006)           (0.000)           (0.000)
1 Year as Principal                             0.004            -0.003            -0.003
                                               (0.032)           (0.002)           (0.003)
2 Years as Principal                            0.011            -0.001            -0.000
                                               (0.035)           (0.002)           (0.004)
3 Years as Principal                            0.003            -0.006*           -0.009*
                                               (0.041)           (0.002)           (0.004)
4 Years as Principal                           -0.056            -0.006*           -0.011*
                                               (0.044)           (0.003)           (0.004)
5 or More Years as Principal                   -0.023            -0.004+           -0.010*
                                               (0.044)           (0.002)           (0.004)
Joint Test of Principal Experience (p-value)    0.63              0.07              0.01
Observations                                    425,872           426,517           426,517
Adjusted R2                                     0.06              0.01              0.03

Note: Voluntary absences include self-treated illness and personal days. See Data Appendix for more
details. All specifications include covariates listed in the notes to Table 6.






Table 10: Student Outcomes and Prior Experience as Assistant Principal or Teacher in Current School

                                                 Math Test Scores      English Test Scores   Student Absences      Suspensions Per 100 Students
First 2 Years as Principal:
Ever Assistant Principal in Current School       0.024*                0.026*               -0.149                -0.007*
                                                (0.010)               (0.008)               (0.101)               (0.003)
Years as Assistant Principal in Current School              0.003*                0.001                -0.012                -0.001
                                                           (0.001)               (0.001)               (0.018)               (0.001)
Ever Teacher in Current School                  -0.012                -0.017+               -0.109                -0.009*
                                                (0.013)               (0.010)               (0.132)               (0.005)
Years as Teacher in Current School                         -0.000                -0.001                -0.008                -0.001*
                                                           (0.001)               (0.001)               (0.010)               (0.000)
Years 3+ as Principal:
Ever Assistant Principal in Current School      -0.015                -0.010                -0.016                -0.001
                                                (0.011)               (0.008)               (0.119)               (0.003)
Years as Assistant Principal in Current School             -0.002                -0.002                -0.001                -0.001
                                                           (0.001)               (0.002)               (0.021)               (0.000)
Ever Teacher in Current School                   0.002                -0.002                -0.100                -0.005
                                                (0.014)               (0.010)               (0.152)               (0.004)
Years as Teacher in Current School                         -0.000                 0.000                -0.001                -0.000
                                                           (0.001)               (0.001)               (0.011)               (0.000)
Observations                                     3,690,658  3,690,658  3,367,302  3,367,302  3,851,268  3,851,268  3,851,268  3,851,268
Adjusted R2                                      0.34       0.34       0.34       0.34       0.15       0.15       0.04       0.04

Note: The unit of observation is a student-year. All regressions include school fixed effects, year-grade fixed effects, controls for students' grade level, ethnicity, gender,
grade repetition, and participation in free lunch, English language learner, and special education programs, and school-year cell averages of the student level controls and
class size. Coefficients on individual student characteristics are allowed to differ by grade level. Other time-varying and time-invariant principal characteristics shown in
Table 1 (other than AP or teacher experience) are also included as covariates.






Table 11: Student Achievement and Aspiring Principals Program Transitions

                                  Math Test     English Test  Student       Suspensions
                                  Scores        Scores        Absences      Per 100 Students
Aspiring Principals Program (APP)
  2 Years Prior to Entrance        0.002        -0.026*        0.069         0.003
                                  (0.014)       (0.013)       (0.192)       (0.005)
  1 Year Prior to Entrance        -0.015        -0.033*       -0.042         0.001
                                  (0.016)       (0.016)       (0.213)       (0.006)
  1st Year as Principal           -0.044*       -0.052*        0.155         0.025*
                                  (0.019)       (0.017)       (0.207)       (0.009)
  2nd Year as Principal           -0.058*       -0.069*       -0.079         0.028*
                                  (0.029)       (0.019)       (0.248)       (0.011)
  3rd Year as Principal           -0.045        -0.030        -0.227         0.037*
                                  (0.035)       (0.022)       (0.356)       (0.016)
Transition to Non-APP Principal
  2 Years Prior to Entrance       -0.007        -0.004         0.008         0.002
                                  (0.006)       (0.005)       (0.081)       (0.003)
  1 Year Prior to Entrance        -0.012        -0.006         0.104         0.006*
                                  (0.008)       (0.007)       (0.094)       (0.003)
  1st Year as Principal           -0.023+       -0.013        -0.200         0.013*
                                  (0.013)       (0.010)       (0.134)       (0.006)
  2nd Year as Principal           -0.031+       -0.022+       -0.041         0.015*
                                  (0.016)       (0.013)       (0.153)       (0.007)
  3rd Year as Principal           -0.015        -0.017        -0.238         0.012
                                  (0.020)       (0.016)       (0.173)       (0.009)
School Fixed Effects               Y             Y             Y             Y
Observations                       3,690,658     3,367,302     3,851,268     3,851,268
Adjusted R2                        0.34          0.34          0.15          0.04



Note: The unit of observation is a student-year. In addition to all principal credentials shown in Table 6, all 
regressions include year-grade fixed effects, controls for students' grade level, ethnicity, gender, grade repetition, 
and participation in free lunch, English language learner, and special education programs, and school-year cell 
averages of the student-level controls and class size. Coefficients on individual student characteristics are allowed 
to differ by grade level. Standard errors are clustered by school. 






Table 12: Student Outcomes and Time-Varying Principal Credentials

                                              Math          English       Absences      Suspensions
                                              Test Scores   Test Scores                 /100
Cahn Fellow (Post-selection)                   0.028         0.024        -0.325+        0.009
                                              (0.017)       (0.018)       (0.170)       (0.009)
1 Year as Principal                            0.002        -0.001        -0.070         0.002
                                              (0.004)       (0.004)       (0.050)       (0.002)
2 Years as Principal                           0.014*        0.003        -0.116         0.004
                                              (0.007)       (0.006)       (0.072)       (0.002)
3 Years as Principal                           0.022*        0.001        -0.130         0.001
                                              (0.009)       (0.008)       (0.096)       (0.003)
4 Years as Principal                           0.029*        0.011        -0.257*        0.002
                                              (0.012)       (0.011)       (0.118)       (0.004)
5 or More Years as Principal                   0.030+        0.014        -0.244        -0.001
                                              (0.016)       (0.014)       (0.149)       (0.004)
Joint Test of Principal Experience (p-value)   0.05          0.250         0.260         0.100
Principal-School Fixed Effects                 Y             Y             Y             Y
Observations                                   3,690,658     3,367,302     3,851,268     3,851,268
Adjusted R2                                    0.35          0.34          0.15          0.05



Note: The unit of observation is a student-year. All regressions include year-grade fixed effects, controls for 
students' grade level, ethnicity, gender, grade repetition, and participation in free lunch, English language 
learner, and special education programs, and school-year cell averages of the student-level controls and class size. 
Coefficients on individual student characteristics are allowed to differ by grade level. Standard errors are 
clustered by school. 
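
As a reading aid for Table 12, the significance markers can be related to the ratio of each coefficient to its standard error. The sketch below recomputes these ratios for the principal experience rows of the Math Test Scores column. It assumes, since the marker legend is not restated with this table, that * denotes significance at the 5 percent level and + at the 10 percent level under a normal approximation (critical values 1.96 and 1.645). The names MATH_ROWS, t_ratio, and marker are illustrative and are not part of the original analysis. Because the table reports rounded values, a ratio that falls close to a threshold may not reproduce the printed marker.

```python
# Reported coefficients and standard errors, copied from the
# Math Test Scores column of Table 12 (principal experience rows).
MATH_ROWS = {
    "1 Year as Principal": (0.002, 0.004),
    "2 Years as Principal": (0.014, 0.007),
    "3 Years as Principal": (0.022, 0.009),
    "4 Years as Principal": (0.029, 0.012),
    "5 or More Years as Principal": (0.030, 0.016),
}


def t_ratio(coef, se):
    """Ratio of a point estimate to its standard error."""
    return coef / se


def marker(coef, se):
    """Significance marker under the ASSUMED convention:
    '*' if |t| >= 1.96 (5 percent), '+' if |t| >= 1.645 (10 percent)."""
    t = abs(t_ratio(coef, se))
    if t >= 1.96:
        return "*"
    if t >= 1.645:
        return "+"
    return ""


if __name__ == "__main__":
    for label, (coef, se) in MATH_ROWS.items():
        print(f"{label:<30} {coef:6.3f}{marker(coef, se):<2} t = {t_ratio(coef, se):5.2f}")
```

Under these assumptions the sketch assigns * to the 2-, 3-, and 4-year rows and + to the 5-or-more-years row, which matches the markers printed in the table.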